Examining sensor metadata




Analyzing the nodes.csv file (metadata):



Examining the other metadata files that come with the dataset:

The files in the main data download directory contain the sensor data and the associated metadata needed to parse the sensor values.

Overview

This sensor data digest contains the following files:

These files will be described in-depth in the following sections.

Sensor Data

The sensor data file is an aggregate of all published data from the project's nodes. By published, they mean:

The data.csv.gz file is a compressed CSV that includes, but is not limited to, the following columns:

These fields will always be provided as a header, for example:

timestamp,node_id,subsystem,sensor,parameter,value_raw,value_hrf
2017/09/09 22:12:44,001e0610ba8f,lightsense,hih4030,humidity,NA,32.18
2017/09/09 22:12:44,001e0610ba8f,lightsense,hih4030,temperature,NA,48.55
2017/09/09 22:12:44,001e0610ba8f,lightsense,ml8511,intensity,9643,NA
2017/09/09 22:12:44,001e0610ba8f,lightsense,tmp421,temperature,NA,43.81
2017/09/09 22:12:44,001e0610ba8f,metsense,hih4030,humidity,450,NA
2017/09/09 22:12:44,001e0610ba8f,metsense,htu21d,humidity,NA,41.15
2017/09/09 22:12:44,001e0610ba8f,metsense,htu21d,temperature,NA,24.1
2017/09/09 22:12:44,001e0610ba8f,metsense,metsense,id,00001814B7E8,00001814B7E8
2017/09/09 22:12:44,001e0610ba8f,metsense,pr103j2,temperature,839,NA

For the purposes of this examination, we will use the 'value_hrf' (human-readable) values rather than 'value_raw', since 'value_hrf' holds the converted readings in the units listed in the sensor metadata.

Sensor data is ordered by ascending timestamp.
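A minimal sketch of streaming the compressed file and checking the ordering claim, using only Python's standard library. It assumes the timestamp format shown in the sample rows above. The in-memory gzip payload stands in for data.csv.gz, and the helper names `iter_rows` and `is_ascending` are hypothetical.

```python
import csv
import gzip
import io
from datetime import datetime

TS_FORMAT = "%Y/%m/%d %H:%M:%S"  # assumed from the sample rows

def iter_rows(fileobj):
    """Yield one dict per CSV row from a gzip-compressed file object."""
    with gzip.open(fileobj, mode="rt", newline="") as text:
        yield from csv.DictReader(text)

def is_ascending(rows):
    """Return True if timestamps never decrease from one row to the next."""
    previous = None
    for row in rows:
        current = datetime.strptime(row["timestamp"], TS_FORMAT)
        if previous is not None and current < previous:
            return False
        previous = current
    return True

# Stand-in for data.csv.gz: two rows copied from the sample above.
sample = (
    "timestamp,node_id,subsystem,sensor,parameter,value_raw,value_hrf\n"
    "2017/09/09 22:12:44,001e0610ba8f,lightsense,hih4030,humidity,NA,32.18\n"
    "2017/09/09 22:12:44,001e0610ba8f,metsense,htu21d,temperature,NA,24.1\n"
)
compressed = io.BytesIO(gzip.compress(sample.encode("utf-8")))
rows = list(iter_rows(compressed))
print(len(rows), is_ascending(rows))
```

For the real file, passing the path "data.csv.gz" to `gzip.open` in place of the in-memory object should work the same way, and reading row by row avoids loading the whole file at once.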

Additional information, such as each node's coordinates or each sensor's units, can be found in the metadata. More information about these will be provided in the next two sections.

A sensor value may be marked NA, indicating that either the raw or HRF value is unavailable (such rows may be dropped if desired).
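One way to handle the NA marker is sketched below, assuming we only care about numeric 'value_hrf' readings. The helper name `hrf_value` is hypothetical, and the rows are copied from the sample above. The try/except is a simplification: it skips text such as the metsense 'id' value, but a non-numeric field that happens to look like a number would still be converted.

```python
import csv
import io

def hrf_value(row):
    """Return value_hrf as a float, or None when it is NA or non-numeric."""
    text = row["value_hrf"]
    if text == "NA":
        return None
    try:
        return float(text)
    except ValueError:
        return None  # e.g. the metsense 'id' row carries a hex string

sample = (
    "timestamp,node_id,subsystem,sensor,parameter,value_raw,value_hrf\n"
    "2017/09/09 22:12:44,001e0610ba8f,lightsense,hih4030,humidity,NA,32.18\n"
    "2017/09/09 22:12:44,001e0610ba8f,lightsense,ml8511,intensity,9643,NA\n"
    "2017/09/09 22:12:44,001e0610ba8f,metsense,metsense,id,00001814B7E8,00001814B7E8\n"
    "2017/09/09 22:12:44,001e0610ba8f,metsense,htu21d,temperature,NA,24.1\n"
)

# Keep only the rows that carry a usable human-readable value.
usable = []
for row in csv.DictReader(io.StringIO(sample)):
    value = hrf_value(row)
    if value is not None:
        usable.append((row["sensor"], row["parameter"], value))

print(usable)
```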

Node Metadata

The node metadata provides additional information about each of a project's nodes. This file is a CSV with the following fields:

These fields will always be provided as a header, for example:

node_id,project_id,vsn,address,lat,lon,description,start_timestamp,end_timestamp
001e0610ba46,AoT_Chicago,004,State St & Jackson Blvd Chicago IL,41.878377,-87.627678,AoT Chicago (S) [C],2017/10/09 00:00:00,
001e0610ba3b,AoT_Chicago,006,18th St & Lake Shore Dr Chicago IL,41.858136,-87.616055,AoT Chicago (S),2017/08/08 00:00:00,
001e0610ba8f,AoT_Chicago,00D,Cornell & 47th St Chicago IL,41.810342,-87.590228,AoT Chicago (S),2017/08/08 00:00:00,
001e0610ba16,AoT_Chicago,010,Ohio St & Grand Ave Chicago IL,41.891964,-87.611603,AoT Chicago (S) [C],2017/12/01 00:00:00,2018/06/04 00:00:00
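A sketch of how these node rows could be parsed into typed values, assuming the field names in the header above and the same timestamp format as the sensor data. Treating an empty end_timestamp as "no recorded end date" is an assumption, not something the README states. The helper name `parse_node` is hypothetical.

```python
import csv
import io
from datetime import datetime

TS_FORMAT = "%Y/%m/%d %H:%M:%S"  # assumed from the sample rows

def parse_node(row):
    """Convert one nodes.csv row into typed values."""
    end = row["end_timestamp"]
    return {
        "node_id": row["node_id"],
        "vsn": row["vsn"],
        "lat": float(row["lat"]),
        "lon": float(row["lon"]),
        "start": datetime.strptime(row["start_timestamp"], TS_FORMAT),
        # Assumption: an empty end_timestamp means no end date was recorded.
        "end": datetime.strptime(end, TS_FORMAT) if end else None,
    }

# Two rows copied from the sample above.
sample = (
    "node_id,project_id,vsn,address,lat,lon,description,start_timestamp,end_timestamp\n"
    "001e0610ba8f,AoT_Chicago,00D,Cornell & 47th St Chicago IL,41.810342,-87.590228,"
    "AoT Chicago (S),2017/08/08 00:00:00,\n"
    "001e0610ba16,AoT_Chicago,010,Ohio St & Grand Ave Chicago IL,41.891964,-87.611603,"
    "AoT Chicago (S) [C],2017/12/01 00:00:00,2018/06/04 00:00:00\n"
)
nodes = [parse_node(row) for row in csv.DictReader(io.StringIO(sample))]
open_ended = [node["node_id"] for node in nodes if node["end"] is None]
print(open_ended)
```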

Additional details about a node are contained in the description field. The letters inside the brackets [ ] indicate:

Sensor Metadata

The sensor metadata provides additional information about each of the sensors published by the project. This file is a CSV with the following fields:

These fields will always be provided as a header, for example:

ontology,subsystem,sensor,parameter,hrf_unit,hrf_minval,hrf_maxval,datasheet
/sensing/meteorology/pressure,metsense,bmp180,pressure,hPa,300,1100,"https://github.com/waggle-sensor/sensors/blob/master/sensors/airsense/bmp180.pdf"
/sensing/meteorology/temperature,metsense,bmp180,temperature,C,-40,125,"https://github.com/waggle-sensor/sensors/blob/master/sensors/airsense/bmp180.pdf"
/sensing/meteorology/humidity,metsense,hih4030,humidity,RH,0,100,"https://github.com/waggle-sensor/sensors/blob/master/sensors/airsense/htu4030.pdf"
/sensing/meteorology/humidity,metsense,htu21d,humidity,RH,0,100,"https://github.com/waggle-sensor/sensors/blob/master/sensors/airsense/htu21d.pdf"
/sensing/meteorology/temperature,metsense,htu21d,temperature,C,-40,125,"https://github.com/waggle-sensor/sensors/blob/master/sensors/airsense/htu21d.pdf"

More in-depth information about each sensor can be found at: https://github.com/waggle-sensor/sensors
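The hrf_minval and hrf_maxval fields suggest a simple sanity check on 'value_hrf' readings. The sketch below assumes the limits are inclusive and always numeric, and that (subsystem, sensor, parameter) identifies a row uniquely; none of this is confirmed by the README. The helper names and the out-of-range reading of 150.0 are made up for illustration.

```python
import csv
import io

def load_limits(sensors_csv_text):
    """Map (subsystem, sensor, parameter) to (hrf_minval, hrf_maxval)."""
    limits = {}
    for row in csv.DictReader(io.StringIO(sensors_csv_text)):
        key = (row["subsystem"], row["sensor"], row["parameter"])
        limits[key] = (float(row["hrf_minval"]), float(row["hrf_maxval"]))
    return limits

def in_range(limits, subsystem, sensor, parameter, value):
    """Return True/False for a known sensor, or None when no limits exist."""
    bounds = limits.get((subsystem, sensor, parameter))
    if bounds is None:
        return None
    low, high = bounds
    return low <= value <= high  # assumption: limits are inclusive

# Two rows copied from the sensors.csv sample above.
SENSORS_CSV = (
    "ontology,subsystem,sensor,parameter,hrf_unit,hrf_minval,hrf_maxval,datasheet\n"
    "/sensing/meteorology/humidity,metsense,htu21d,humidity,RH,0,100,"
    '"https://github.com/waggle-sensor/sensors/blob/master/sensors/airsense/htu21d.pdf"\n'
    "/sensing/meteorology/temperature,metsense,htu21d,temperature,C,-40,125,"
    '"https://github.com/waggle-sensor/sensors/blob/master/sensors/airsense/htu21d.pdf"\n'
)
limits = load_limits(SENSORS_CSV)
print(in_range(limits, "metsense", "htu21d", "temperature", 24.1))
print(in_range(limits, "metsense", "htu21d", "humidity", 150.0))  # made-up reading
```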

Provenance Metadata

The provenance metadata provides additional information about the origin of this project digest. This file is a CSV with the following fields:

These fields will always be provided as a header, for example:

data_format_version,project_id,data_start_date,data_end_date,creation_date,url
1,AoT_Chicago.complete,2017/03/31 00:00:00,2018/04/10 15:34:36,2018/04/10 15:34:36,http://www.mcs.anl.gov/research/projects/waggle/downloads/datasets/AoT_Chicago.complete.latest.tar.gz
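As a quick illustration, the date range covered by a digest can be read from the provenance row. This sketch assumes the same timestamp format as the other files and uses the example row above; the variable names are hypothetical.

```python
import csv
import io
from datetime import datetime

TS_FORMAT = "%Y/%m/%d %H:%M:%S"  # assumed from the example row

# The example provenance row from above.
PROVENANCE_CSV = (
    "data_format_version,project_id,data_start_date,data_end_date,creation_date,url\n"
    "1,AoT_Chicago.complete,2017/03/31 00:00:00,2018/04/10 15:34:36,2018/04/10 15:34:36,"
    "http://www.mcs.anl.gov/research/projects/waggle/downloads/datasets/"
    "AoT_Chicago.complete.latest.tar.gz\n"
)

record = next(csv.DictReader(io.StringIO(PROVENANCE_CSV)))
start = datetime.strptime(record["data_start_date"], TS_FORMAT)
end = datetime.strptime(record["data_end_date"], TS_FORMAT)
span_days = (end - start).days  # whole days between first and last reading
print(record["project_id"], span_days)
```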

Analyzing the sensors.csv file (metadata):


NOTE: 

The Waggle Platform is an open source software and hardware platform
for intelligent sensors with advanced edge computing and support for
machine learning.  The Waggle Platform is used by several wireless
sensor projects, including the Array of Things project
(https://arrayofthings.github.io).  For more information on the Waggle
Platform, see http://www.wa8.gl.

For details on the open source software license for Waggle components,
please see the file LICENSE.waggle.txt. Some source code files may not
have been originally authored by members of the Waggle team.  In such
cases, a small note describing the modifications for Waggle has been
added to the original copyright and license.

Looking at just the 'metsense' subsystem, which we know contains roughly 9 sensors (variables).


That is, the 'sensor' column holds the part name from the spec sheet; we know, for instance, that HIH4030 is a humidity sensor. There should be roughly 9 main sensors within the metsense subsystem.
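A small sketch of how the sensors in one subsystem could be listed from sensors.csv, grouping parameters by part name. The rows below carry only the three columns needed and are copied from the five sample rows in the Sensor Metadata section, so they cover only part of metsense. The helper name `parameters_by_sensor` is hypothetical.

```python
from collections import defaultdict

def parameters_by_sensor(rows, subsystem):
    """Group parameter names by sensor (part name) for one subsystem."""
    grouped = defaultdict(list)
    for row in rows:
        if row["subsystem"] == subsystem:
            grouped[row["sensor"]].append(row["parameter"])
    return dict(grouped)

# Subsystem, sensor and parameter values from the sensors.csv sample rows.
rows = [
    {"subsystem": "metsense", "sensor": "bmp180", "parameter": "pressure"},
    {"subsystem": "metsense", "sensor": "bmp180", "parameter": "temperature"},
    {"subsystem": "metsense", "sensor": "hih4030", "parameter": "humidity"},
    {"subsystem": "metsense", "sensor": "htu21d", "parameter": "humidity"},
    {"subsystem": "metsense", "sensor": "htu21d", "parameter": "temperature"},
]
metsense = parameters_by_sensor(rows, "metsense")
print(sorted(metsense))
```

Run against the full sensors.csv (for example via `csv.DictReader`), the number of keys in the result would give the actual sensor count for the subsystem.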

It is worth noting that, at a later point, it may be possible to correlate a rise in one sensor's values with higher temperatures (which can be monitored by separate sensors). This would make it possible to properly normalize the chemical sensor's output.
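One possible starting point for that later analysis is a Pearson correlation between paired readings. The sketch below assumes the two series have already been aligned by timestamp. All numbers are made up purely for illustration and do not come from the dataset.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up, illustrative values: temperature readings and a chemical
# sensor's output taken at the same (hypothetical) timestamps.
temperature = [20.0, 22.0, 24.0, 26.0]
chemical = [1.0, 1.5, 2.1, 2.4]
r = pearson(temperature, chemical)
print(round(r, 3))
```

A coefficient close to 1 would hint that the chemical sensor's output tracks temperature, which is the situation where a temperature-based normalization might be worth trying.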


HUGE AMOUNT OF DATA: 

data_start_date: 9/14/2016 0:00

data_end_date:  9/12/2021 1:20

Most of the data, though, appears to fall in 2018-2020, continuing into 2021.
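To check that impression without loading everything into memory, rows can be tallied per year as the file is streamed. The sketch assumes the 'timestamp' column starts with a four-digit year, as in the data.csv.gz sample rows. The timestamps below are made up for illustration, and the helper name `rows_per_year` is hypothetical.

```python
from collections import Counter

def rows_per_year(timestamps):
    """Count rows per calendar year from 'YYYY/MM/DD hh:mm:ss' strings."""
    return Counter(ts[:4] for ts in timestamps)

# Made-up timestamps in the data.csv.gz format, for illustration only.
timestamps = [
    "2016/09/14 00:00:00",
    "2018/03/01 12:00:00",
    "2018/07/15 08:30:00",
    "2019/01/02 23:59:59",
]
counts = rows_per_year(timestamps)
print(sorted(counts.items()))
```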